OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

New statistical method for machine-printed Arabic character recognition

Internal identifier: 001387 (Main/Exploration); previous: 001386; next: 001388

Authors: HUA WANG [People's Republic of China]; XIAOQING DING [People's Republic of China]; JIANMING JIN [People's Republic of China]; HALMURAT [People's Republic of China]

RBID : Pascal:05-0360134

Abstract

Although about 300 million people worldwide, writing in several different languages, use Arabic characters, Arabic OCR has not been researched as thoroughly as OCR for other widely used scripts (Latin or Chinese). In this paper, a new statistical method is developed to recognize machine-printed Arabic characters. First, the entire Arabic character set is pre-classified into 32 subsets according to character form, the special zones that characters occupy, and component information. Then directional features are extracted, and a modified quadratic discriminant function (MQDF) is used as the classifier on these features. Finally, similar characters are discriminated before the recognition results are output. Encouraging experimental results on test sets support the validity of the proposed method.
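
The abstract names the MQDF classifier but does not give its formulation. As a rough illustration only, the sketch below implements a commonly used form of the modified quadratic discriminant function (often called MQDF2), in which the k largest covariance eigenvalues of each class are kept and the remaining ones are replaced by a constant. It is an assumption that this form resembles what the paper uses. The names `ClassModel`, `mqdf_distance`, `classify` and `delta` are hypothetical, and the per-class parameters are assumed to have been estimated beforehand from directional feature vectors.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class ClassModel:
    """Per-class parameters (hypothetical container), assumed pre-estimated."""
    mean: Sequence[float]               # class mean vector, length d
    eigvals: Sequence[float]            # k largest covariance eigenvalues
    eigvecs: Sequence[Sequence[float]]  # matching unit eigenvectors, each length d
    delta: float                        # constant replacing the d - k minor eigenvalues


def mqdf_distance(x: Sequence[float], m: ClassModel) -> float:
    """MQDF2-style distance between feature vector x and a class (smaller is better)."""
    diff = [xi - mi for xi, mi in zip(x, m.mean)]
    sq_norm = sum(v * v for v in diff)
    d, k = len(diff), len(m.eigvals)

    major = 0.0      # Mahalanobis-like term inside the principal subspace
    projected = 0.0  # squared length of diff captured by the principal subspace
    for lam, vec in zip(m.eigvals, m.eigvecs):
        p = sum(a * b for a, b in zip(vec, diff))
        major += p * p / lam
        projected += p * p

    # Residual outside the principal subspace, scaled by the constant delta.
    residual = max(sq_norm - projected, 0.0) / m.delta
    # Log-determinant term of the modified covariance matrix.
    log_det = sum(math.log(lam) for lam in m.eigvals) + (d - k) * math.log(m.delta)
    return major + residual + log_det


def classify(x: Sequence[float], models: Dict[str, ClassModel]) -> str:
    """Return the label whose MQDF2-style distance to x is smallest."""
    return min(models, key=lambda label: mqdf_distance(x, models[label]))


# Toy two-class example in two dimensions (made-up parameters, not from the paper).
models = {
    "A": ClassModel(mean=[0.0, 0.0], eigvals=[4.0], eigvecs=[[1.0, 0.0]], delta=1.0),
    "B": ClassModel(mean=[5.0, 0.0], eigvals=[1.0], eigvecs=[[0.0, 1.0]], delta=1.0),
}
print(classify([2.0, 0.0], models))
```

In a pipeline like the one the abstract outlines, such a classifier would presumably run only on the candidate classes of the subset chosen by pre-classification, but the abstract does not confirm this.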


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">New statistical method for machine-printed Arabic character recognition</title>
<author>
<name sortKey="Hua Wang" sort="Hua Wang" uniqKey="Hua Wang" last="Hua Wang">HUA WANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Jianming Jin" sort="Jianming Jin" uniqKey="Jianming Jin" last="Jianming Jin">JIANMING JIN</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Halmurat" sort="Halmurat" uniqKey="Halmurat" last="Halmurat">HALMURAT</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>School of Information, Xinjiang Univ</s1>
<s2>Urumqi 830046</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<wicri:noRegion>Urumqi 830046</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">05-0360134</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0360134 INIST</idno>
<idno type="RBID">Pascal:05-0360134</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000462</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000326</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000395</idno>
<idno type="wicri:doubleKey">1017-2653:2005:Hua Wang:new:statistical:method</idno>
<idno type="wicri:Area/Main/Merge">001425</idno>
<idno type="wicri:Area/Main/Curation">001387</idno>
<idno type="wicri:Area/Main/Exploration">001387</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">New statistical method for machine-printed Arabic character recognition</title>
<author>
<name sortKey="Hua Wang" sort="Hua Wang" uniqKey="Hua Wang" last="Hua Wang">HUA WANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Jianming Jin" sort="Jianming Jin" uniqKey="Jianming Jin" last="Jianming Jin">JIANMING JIN</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Image Division, Dept. of Electronic Engineering, Tsinghua Univ</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Halmurat" sort="Halmurat" uniqKey="Halmurat" last="Halmurat">HALMURAT</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>School of Information, Xinjiang Univ</s1>
<s2>Urumqi 830046</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<wicri:noRegion>Urumqi 830046</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint>
<date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Arabic</term>
<term>Automatic classification</term>
<term>Character recognition</term>
<term>Character set</term>
<term>Chinese</term>
<term>Discriminant function</term>
<term>Feature extraction</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Printed character</term>
<term>Quadratic function</term>
<term>Signal classification</term>
<term>Signal processing</term>
<term>Statistical method</term>
<term>Testing equipment</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Méthode statistique</term>
<term>Caractère imprimé</term>
<term>Arabe</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Chinois</term>
<term>Jeu caractère</term>
<term>Extraction caractéristique</term>
<term>Fonction quadratique</term>
<term>Fonction discriminante</term>
<term>Classification automatique</term>
<term>Appareillage essai</term>
<term>Reconnaissance forme</term>
<term>Traitement signal</term>
<term>Classification signal</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Méthode statistique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Although about 300 million people worldwide, in several different languages, take Arabic characters for writing, Arabic OCR has not been researched as thoroughly as other widely used characters (Latin or Chinese). In this paper, a new statistical method is developed to recognize machine-printed Arabic characters. Firstly, the entire Arabic character set is pre-classified into 32 sub-sets in terms of character forms, special zones that characters occupy and component information. Then directional features are extracted based on which modified quadratic discriminant function (MQDF) is utilized as classifier to deal with classification task. Finally, similar characters are discriminated before outputting recognition results. Encouraging experimental results on test sets show the validity of proposed method.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>République populaire de Chine</li>
</country>
<settlement>
<li>Pékin</li>
</settlement>
</list>
<tree>
<country name="République populaire de Chine">
<noRegion>
<name sortKey="Hua Wang" sort="Hua Wang" uniqKey="Hua Wang" last="Hua Wang">HUA WANG</name>
</noRegion>
<name sortKey="Halmurat" sort="Halmurat" uniqKey="Halmurat" last="Halmurat">HALMURAT</name>
<name sortKey="Jianming Jin" sort="Jianming Jin" uniqKey="Jianming Jin" last="Jianming Jin">JIANMING JIN</name>
<name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
</country>
</tree>
</affiliations>
</record>
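
As a hedged illustration of how a record like the one above could be read outside Dilib, the following sketch uses Python's standard `xml.etree.ElementTree`. The record uses the `inist:` and `wicri:` prefixes without declaring them, so the sketch binds them to placeholder URIs before parsing. Those URIs, the `SAMPLE` fragment and the function names `parse_record` and `summarize` are assumptions made for this example and are not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions): they exist only so that a standard
# XML parser accepts the undeclared inist: and wicri: prefixes.
PLACEHOLDER_NS = (
    'xmlns:inist="urn:example:inist" '
    'xmlns:wicri="urn:example:wicri"'
)

# Trimmed fragment with the same element layout as the record above.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<publicationStmt>
<date when="2005">2005</date>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">New statistical method for machine-printed Arabic character recognition</title>
<author>
<name sortKey="Halmurat">HALMURAT</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>School of Information, Xinjiang Univ</s1>
</inist:fA14>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Arabic</term>
<term>Character recognition</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(xml_text: str) -> ET.Element:
    """Parse a <record> string after binding the undeclared prefixes."""
    patched = xml_text.replace("<record>", f"<record {PLACEHOLDER_NS}>", 1)
    return ET.fromstring(patched)


def summarize(record: ET.Element) -> dict:
    """Collect title, authors, year and English keywords from a parsed record."""
    analytic = record.find(".//analytic")
    return {
        "title": analytic.findtext("title"),
        "authors": [name.text for name in analytic.findall("author/name")],
        "year": record.find(".//publicationStmt/date").get("when"),
        "keywords": [
            term.text
            for term in record.findall(".//keywords[@scheme='KwdEn']/term")
        ],
    }


summary = summarize(parse_record(SAMPLE))
print(summary["title"], summary["authors"], summary["year"])
```

Reading authors from `analytic` rather than `titleStmt` avoids counting each author twice, since the record lists them in both places.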

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001387 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001387 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:05-0360134
   |texte=   New statistical method for machine-printed Arabic character recognition
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024